Hectiling: An Integration of Fine and Coarse-Grained Load-Balancing Strategies

Authors

  • Samuel H. Russ
  • Ioana Banicescu
  • Sheikh Ghafoor
  • Bharathi Janapareddi
  • Jonathan Robinson
  • Rong Lu
Abstract

General-purpose programmers have come to expect a high degree of portability among widely varying architectures. Advances in run-time systems for parallel programs have been proposed in order to harness available resources as efficiently as possible. Simultaneously, advances in algorithmic ways of dynamically balancing computational load have been proposed in order to respond to variations in actual performance and therefore in runtime. The primary mechanism for harnessing idle resources effectively, task migration, can be used alongside the primary mechanism for dynamic load balancing, data redistribution. Besides the fact that the two methods can be used simultaneously to spur further increases in performance, the run-time information-gathering infrastructure necessary to detect and use idle resources can also benefit dynamically load-balanced applications. This paper describes an architecture for and preliminary implementation of a system that combines data-parallel load-balancing with task-parallel load-balancing. Performance test results are included as well.

1: Introduction and Survey of Existing Systems

One of the responsibilities of a parallel program and/or run-time system is that of load-balancing. Individual processors may vary in performance, external workload, or data distribution, and so methods to maintain an even distribution of work are usually needed to obtain good performance and speedup. One can consider a parallel program as consisting of p threads of execution and q data partitions. In general, p does not have to equal q (although this is usually the case). To maintain a balanced load, the threads can be moved, the data in the partitions can be redistributed, or the two (matched pairs of threads and data) can be moved together. These three cases can be called “thread migration”, “data migration”, and “task migration”, respectively.

This paper considers the network-of-workstation (NOW) environment, and in the NOW environment there is an additional reason to migrate work. It is often desirable to return a workstation back to its “owner” and release the CPU and memory resources. Thus “release of resources” is an additional reason to provide and use task migration. This is an important distinction, as task-migration-based load-balancing systems can perform task migrations for reasons unrelated to load-balancing, and data-migration and thread-migration-based systems may also provide task migration services. The three styles of migration (and of programming) are characterized below.
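The distinction between the three kinds of migration can be made concrete with a small toy model. The sketch below is illustrative only and is not taken from the paper's system: the `Processor` class, the function names, and the representation of a data partition as a simple item count are all assumptions chosen for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    """Hypothetical model of one workstation (not the paper's data structure)."""
    name: str
    threads: list = field(default_factory=list)     # ids of threads hosted here
    partitions: dict = field(default_factory=dict)  # partition id -> item count


def migrate_thread(src, dst, thread_id):
    """Thread migration: move a thread of execution; the data stays put."""
    src.threads.remove(thread_id)
    dst.threads.append(thread_id)


def migrate_data(src, src_part, dst, dst_part, n_items):
    """Data migration: redistribute n_items between two partitions."""
    if n_items > src.partitions[src_part]:
        raise ValueError("cannot move more items than the partition holds")
    src.partitions[src_part] -= n_items
    dst.partitions[dst_part] += n_items


def migrate_task(src, dst, thread_id, partition_id):
    """Task migration: move a matched thread/partition pair together."""
    src.threads.remove(thread_id)
    dst.threads.append(thread_id)
    dst.partitions[partition_id] = src.partitions.pop(partition_id)
```

In this toy model, releasing a workstation to its owner would amount to calling `migrate_task` for every thread/partition pair it hosts, which is why task migration can occur for reasons unrelated to load balancing.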

Similar resources

Competitive Resource Management in Distributed Computing Environments with Hectiling

Resource management for scientific applications in distributed computing environments is a complex problem. Over time, various techniques to manage resources at either coarse or fine levels of granularity in a network of workstations (NOW) have been proposed. A technique which has proven to be highly effective at the finer level of granularity is Fractiling. It is a dynamic scheduling scheme, which a...

Experiences from integrating algorithmic and systemic load balancing strategies

Load balancing increases the efficient use of existing resources for parallel and distributed applications. At a coarse level of granularity, advances in runtime systems for parallel programs have been proposed in order to control available resources as efficiently as possible by utilizing idle resources and using task migration. Simultaneously, at a finer granularity level, advances in algorithmic s...

Parallel branch and bound on fine-grained hypercube multiprocessors

In this paper, we study parallel branch and bound on fine grained hypercube multiprocessors. Each processor in a fine grained system has only a very small amount of memory available. Therefore, current parallel branch and bound methods for coarse grained systems (s 1000 nodes) cannot be applied, since all these methods assume that every processor stores the path from the node it is currently p...

Load balancing and OpenMP implementation of nested parallelism

Many problems have multiple layers of parallelism. The outer-level may consist of few and coarse-grained tasks. Next, each of these tasks may also be rich in parallelism, and be split into a number of fine-grained tasks, which again may consist of even finer subtasks, and so on. Here we argue and demonstrate by examples that utilizing multiple layers of parallelism may give much better scaling ...

Load Balancing Factor (lbf): a Workload Migration Metric

We introduce a new performance metric, called Load Balancing Factor (LBF), to evaluate different tuning alternatives of workload migration within a distributed/parallel program. The metric is unique because it shows the performance implications of a specific tuning alternative rather than quantifying where time is spent in the program. Previously we developed a variation of the metric for coars...

Publication date: 1998